
Genetics in Medicine Open

Elsevier BV

All preprints, ranked by how well they match Genetics in Medicine Open's content profile, based on 10 papers previously published here. The average preprint has a 0.00% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Generating Advancements in Longitudinal Analysis in X and Y Variations: Rationale, Design, and Methods for the GALAXY Registry

Carl, A.; Bothwell, S.; Swenson, K.; Cohen, L.; Cover, V.; Dawczyk, A.; Decker, G.; Gerken, S.; Hong, D.; Howell, S.; Raznahan, A.; Rogol, A.; Tartaglia, N.; Davis, S.

2024-08-14 genetic and genomic medicine 10.1101/2024.08.14.24311888 medRxiv
Top 0.1%
3.7%

Sex chromosome aneuploidies (SCAs) are a family of genetic disorders that result from an atypical number of X and/or Y chromosomes. SCAs are the most common chromosomal abnormality, affecting ~1/400 live births, yet are often underdiagnosed, leading to over-representation of more severely impacted individuals in many clinical studies. In addition to this ascertainment bias, existing work in SCAs has also been limited by low geographic and demographic diversity. To address these limitations, we have created the Generating Advancements with Longitudinal Analysis in X and Y variations (GALAXY) Registry. To date, GALAXY has accrued 295 verified SCA participants. Next steps include targeted recruitment of minoritized individuals regarding race, ethnicity, and socioeconomic status, and continuing engagement with the SCA community.

2
Low-level mosaic variants causing the pancreatic disease congenital hyperinsulinism can be detected from blood DNA

Bennett, J. J.; Laver, T. W.; Mannisto, J. M. E.; Houghton, J. A. L.; De Franco, E.; Kalyon, O.; Wright, S.; Johnson, A.-M.; De Leon, D. D.; Globa, E.; Kummer, S.; Banerjee, I.; Dastamani, A.; International Congenital Hyperinsulinism Consortium; Wakeling, M. N.; Johnson, M. B.; Flanagan, S. E.

2026-01-15 genetic and genomic medicine 10.64898/2026.01.13.26344002 medRxiv
Top 0.1%
3.6%

A substantial proportion of individuals with a well-defined monogenic disorder remain without a genetic diagnosis. Low-level mosaic pathogenic variants are increasingly recognised as an underappreciated cause of monogenic disease but are technically challenging to detect, particularly in organ-specific conditions when affected tissue is inaccessible. We systematically investigated low-level mosaic variants in individuals with congenital hyperinsulinism (CHI: n=1,252) or neonatal diabetes (NDM: n=312), two opposing pancreatic disorders of insulin secretion. We screened for established pathogenic variants with variant allele fraction (VAF) <8% in dominant CHI (ABCC8, GCK, GLUD1, HK1) or dominant NDM (ABCC8, KCNJ11, INS) genes in targeted next generation sequencing (tNGS) data using Mutect2. This called 40 variants across the four genes in 39 individuals with CHI. No candidate variants were found in the NDM cohort. Orthogonal validation of 35 variants using TaqMan-based droplet digital PCR (ddPCR) confirmed 26/35 variants. The median VAF for confirmed variants was 3.6% (1.1-7.8%), while false positives (9/35) predominantly had a VAF <1% with some overlap in VAF with true positives. This study shows that disease-causing low-level mosaic variants in dominant CHI genes can be detected in blood using tNGS but require orthogonal validation. These results provide a framework to improve diagnostic yield in organ-specific conditions where mosaic variants may represent an important missed cause of disease.

3
Whole-Genome Sequencing is a Viable Replacement for Chromosomal Microarray and Fragile X PCR Testing

Gao, Y.; South, S. T.; Cain, C. C.; Cox, J. L.; Fleharty, M.; Hilton, B. A.; Larkin, K.; Li, G.; Marsh, D. W.; Popic, V.; Toydemir, R. M.; Hofherr, S.; Lennon, N.; Bui, P.

2025-05-25 genetic and genomic medicine 10.1101/2025.05.24.25328260 medRxiv
Top 0.1%
3.6%

Developmental disabilities and congenital anomalies are common pediatric conditions that often require extensive genetic testing to determine an underlying cause. Traditionally, chromosomal microarrays (CMA) and Fragile X testing have served as first-tier diagnostics, but these tests are limited in scope and often necessitate follow-up sequencing assays. Whole Genome Sequencing (WGS) offers a single, comprehensive assay capable of detecting a broad spectrum of genetic variation, including single nucleotide variants (SNVs), insertions and deletions (INDELs), copy number variants (CNVs), structural variants (SVs), loss of heterozygosity (LOH), and tandem repeat alterations. In this study, we evaluated whether WGS could replace CMA and serve as a more effective first-tier test. WGS achieved a 97.28% concordance with CMA for clinically relevant CNVs and LOH, while also offering more accurate breakpoint resolution and broader data point coverage. Notably, 4 out of 5 discordant cases (80%) were due to WGS providing more accurate breakpoint resolution. WGS covered over 97% of clinically relevant regions for CNV detection, compared to < 3% with CMA. To address the interpretive burden associated with the increased CNV calls, we implemented a cohort-based occurrence filter that successfully prioritized potential pathogenic events without sacrificing clinical sensitivity. Additionally, we assessed the feasibility of Fragile X screening from WGS data using a custom PCR confirmation logic built on Expansion Hunter output. This approach accurately excluded normal-range alleles and flagged indeterminate or expanded alleles for follow-up PCR confirmation. Our results support the use of WGS as a scalable and comprehensive diagnostic platform capable of consolidating multiple traditional assays. By streamlining workflows and enhancing clinical resolution, WGS offers a compelling alternative to the current diagnostic paradigm for patients with suspected genetic disorders.

4
Multisite Study of Optical Genome Mapping of Retrospective and Prospective Constitutional Disorder Cohorts

Broeckel, U.; Iqbal, M. A.; Levy, B.; Sahajpal, N.; Nagy, P. L.; Scharer, G.; Bossler, A. D.; Rodriguez, V.; Stence, A.; Skinner, C.; Skinner, S. A.; Kolhe, R.; Stevenson, R.

2022-12-31 genetic and genomic medicine 10.1101/2022.12.26.22283900 medRxiv
Top 0.1%
2.9%

Several medical societies including the American College of Medical Genetics and Genomics, the American Academy of Neurology, and the Association for Molecular Pathology recommend chromosomal microarray (CMA) as the first-tier test in the genetic work-up for individuals with neurodevelopmental disorders such as developmental delay and intellectual disability, autism spectrum disorder, as well as other disorders suspected to be of genetic etiology. Although CMA has significantly increased the diagnostic yield for these disorders, limitations in the technology preclude detection of certain structural variations in the genome and require reflexing to other cytogenomic and molecular methods. Optical genome mapping (OGM) is a high-resolution technology that utilizes ultra-high molecular weight DNA, fluorescently labeled at a hexamer motif found throughout the genome, to create a barcode pattern, analogous to G-banded karyotyping, that can detect all classes of structural variations at very high resolution by comparison to a reference genome. A multisite study, partially published previously, with a total of n=1037 datapoints was conducted and showed 99.6% concordance between OGM and standard-of-care (SOC) testing for completed cases. The current phase of this study included cases from individuals with suspected genetic conditions referred for cytogenomic testing in a prospective postnatal cohort (79 cases with OGM and SOC results) and a retrospective postnatal cohort (262; same criteria). Among these cohorts were an autism spectrum disorder cohort (135) and a group with negative or uninformative results on previous testing (72). Prospective cases referred for CMA were included in this study as an unbiased comparison. OGM results show 100% concordance with variants of uncertain significance, pathogenic variants, and likely pathogenic variants reported by CMA and other SOC testing, and OGM found reportable variants in an additional 10.1% of cases. Among the autism spectrum disorder cohort, OGM found reportable variants in an additional 14.8% of cases. Based on this demonstration of the analytic validity and clinical utility of OGM by this multi-site assessment, and considering clinical diagnostics often require iterative testing for detection and diagnosis in postnatal constitutional disorders, OGM should be considered as a first-tier test for neurodevelopmental disorders and/or suspicion of a genetic disease.

5
Clinical application of Complete Long Read genome sequencing identifies a 16kb intragenic duplication in EHMT1 in a patient with suspected Kleefstra syndrome

Gorzynski, J. E.; Marwaha, S.; Reuter, C. M.; Jensen, T.; Ferrasse, A.; Raja, A.; Fernandez, L.; Kravets, E.; Carter, J.; Bonner, D.; Sutton, S.; Undiagnosed Diseases Network (UDN); Ruzhnikov, M.; Hudgins, L.; Fisher, P. G.; Bernstein, J.; Wheeler, M. T.; Ashley, E. A.

2024-03-29 genetic and genomic medicine 10.1101/2024.03.28.24304304 medRxiv
Top 0.1%
2.8%

Long read sequencing offers benefits for the detection of structural variation in Mendelian disease. Here, we applied a new technology that generates contiguous long reads via tagmentation and sequencing by synthesis to a small cohort of patients with undiagnosed disease from the Undiagnosed Diseases Network. We first compare sequencing from the HG002 benchmark sample from Genome In A Bottle using nanopore sequencing (R10.4.1, duplex reads, Oxford Nanopore), single molecule real time sequencing (Revio SMRT cell, Pacific Biosciences) and complete long read sequencing (S4 flowcell, Novaseq, Illumina). Coverage was 33-35x across platforms. Read length N50 was 6.5kb (ICLR), 16.9kb (SMRT), and 33.8kb (ONT). We noted small differences in single nucleotide variant F1 scores across long read technologies with single nucleotide variant F1 scores (0.985-0.999) exceeding indel scores (0.78-0.99) and structural variant scores (0.74-0.96). We applied CLR sequencing to seven undiagnosed patients. In one patient, we detected and prioritized a novel 16kb intragenic duplication encompassing exons 5 and 6 in EHMT1. Resolution of the breakpoints and examination of flanking sequences revealed that the duplication was present in tandem and was predicted to result in a frameshift of the amino acid sequence and an early termination codon. It resulted in a diagnosis of Kleefstra syndrome. The variant was confirmed with targeted EHMT1 clinical testing and detected via nanopore and SMRT sequencing. In summary, we report the early clinical application of complete long read sequencing to a small cohort of undiagnosed patients.

6
Unifying the communities of early-onset glycogen storage disease type IV and adult polyglucosan body disease through a genetic prevalence study of GBE1-related disease

Koch, R. L.; Akman, H. O.; Chown, E.; Goldman, D.; Levenson, J.; Lu, Q.; Michalovicz Gill, L. T.; Morgan, M.; Orthmann-Murphy, J.; Pires, N. T.; Reef, R.; Saxe, H.; Singer-Berk, M.; Baxter, S.

2025-12-17 genetic and genomic medicine 10.64898/2025.12.16.25342386 medRxiv
Top 0.1%
2.6%

Glycogen storage disease type IV (GSD IV) is an autosomal recessive disorder caused by pathogenic variants in GBE1, resulting in deficient glycogen branching enzyme (GBE) activity and formation of abnormal glycogen ("polyglucosan"). GSD IV manifests across a spectrum of clinical dimensions - including hepatic, neurologic, muscular, and cardiac involvement - which vary in severity. The early-onset forms, historically referred to as Andersen disease, present at different stages ranging from in utero to adolescence. The adult-onset form, referred to as adult polyglucosan body disease (APBD), typically presents in middle to late adulthood. To date, no epidemiological study of GSD IV has been performed. Understanding the global prevalence of GSD IV is critical to increase disease awareness, improve diagnostic rates, inform therapeutic development, and engage pharmaceutical companies. In collaboration with the Rare Genomes Project at the Broad Institute of MIT and Harvard and the APBD Research Foundation, this study curated variants in GBE1 and calculated prevalence across nine genetic ancestry groups. The estimated global carrier frequency of GSD IV is 1 in 243 individuals, and the global genetic prevalence is 1 in 235,784 individuals. Based on the 2024 world population, the estimated number of affected individuals with GSD IV is approximately 34,800. These estimates highlight a significant underdiagnosis of GSD IV and underscore the urgent need for increased awareness of this metabolic disorder. This model of collaboration between researchers, patient advocacy organizations, and genetic data sharing programs provides a framework for estimating the prevalence of other rare diseases in the global population. 
Graphical abstract: Figure 1 (created in BioRender. Koch, R. (2025) https://BioRender.com/j0sg30n).

7
Long-read sequencing with targeted assembly of the opsin locus accurately evaluates genes in expressed positions

Anderson, Z. B.; Prall, T.; Damaraju, N.; Storz, S. H.; Goffena, J.; Miller, A. L.; Carroll, J.; Neitz, M.; Miller, D. E.

2026-03-19 genetic and genomic medicine 10.64898/2026.03.17.26348636 medRxiv
Top 0.1%
2.6%

The human opsin gene cluster at Xq28 contains highly similar OPN1LW and OPN1MW genes essential for red-green color vision. Current molecular methods cannot accurately analyze this complex locus, limiting diagnosis of color vision deficiencies (CVD) and detection of carrier status. We performed Nanopore long-read sequencing of 206 individuals, comparing alignment-based analysis with targeted de novo assembly. Alignment-based methods performed poorly, whereas targeted assembly achieved 99% concordance for OPN1LW and 92% for OPN1MW copy numbers and resolved gene order in all XY individuals and 87% of XX individuals. This approach detected CVD in 3.2% of XY individuals and identified 8% of XX individuals as carriers, consistent with population estimates. Moreover, it molecularly explained the phenotypic severity in a family with Bornholm eye disease and clarified carrier status in an XX individual suspected of carrying two CVD haplotypes. Our approach provides a comprehensive, reference-free method for accurate analysis of expressed opsin genes and reliable CVD carrier detection.

8
Advancing precision care in pregnancy through an actionable fetal findings list

Cohen, J. L.; Duyzend, M.; Adelson, S. M.; Yeo, J.; Fleming, M.; Ganetzky, R. D.; Hale, R.; Mitchell, D. M.; Morton, S. U.; Reimers, R.; Roberts, A. E.; Strong, A.; Tan, W.; Thiagarajah, J. R.; Walker, M. A.; Green, R. C.; Gold, N. B.

2024-09-30 genetic and genomic medicine 10.1101/2024.09.26.24314442 medRxiv
Top 0.1%
2.6%

The use of genomic sequencing (GS) for prenatal diagnosis of fetuses with sonographic abnormalities has grown tremendously over the past decade. Fetal GS also offers an opportunity to identify incidental genomic variants that are unrelated to the fetal phenotype, but may be relevant to fetal and newborn health. There are currently no guidelines for reporting incidental findings from fetal GS. In the United States, GS for adults and children is recommended to include a list of "secondary findings" genes (ACMG SF v3.2) that are associated with disorders for which surveillance or treatment can reduce morbidity and mortality. The genes on ACMG SF v3.2 predominantly cause adult-onset disorders. Importantly, many genetic disorders with fetal and infantile onset are actionable as well. A proposed solution is to create a "fetal actionable findings list," which can be offered to pregnant patients undergoing fetal GS or eventually, as a standalone cell-free fetal DNA screening test. In this integrative review, we propose criteria for an actionable fetal findings list, then identify genetic disorders with clinically available or emerging fetal therapies, and those for which clinical detection in the first week of life might lead to improved outcomes. Finally, we synthesize the potential benefits, limitations, and risks of an actionable fetal findings list.

9
Targeted long-read sequencing resolves complex structural variants and identifies missing disease-causing variants

Miller, D. E.; Sulovari, A.; Wang, T.; Loucks, H.; Hoekzema, K.; Munson, K. M.; Lewis, A. P.; Almanza Fuerte, E. P.; Paschal, C. R.; Thies, J.; Bennett, J. T.; Glass, I.; Dipple, K. M.; Patterson, K.; Bonkowski, E. S.; Nelson, Z.; Squire, A.; Sikes, M.; Beckman, E.; Bennett, R. L.; Earl, D.; Lee, W.; Allikmets, R.; Perlman, S. J.; Chow, P.; Hing, A. V.; Adam, M. P.; Sun, A.; Lam, C.; Chang, I.; University of Washington Center for Mendelian Genomics; Cherry, T.; Chong, J. X.; Bamshad, M. J.; Nickerson, D. A.; Mefford, H. C.; Doherty, D.; Eichler, E. E.

2020-11-04 genomics 10.1101/2020.11.03.365395 medRxiv
Top 0.1%
2.6%

BACKGROUND: Despite widespread availability of clinical genetic testing, many individuals with suspected genetic conditions do not have a precise diagnosis. This limits their opportunity to take advantage of state-of-the-art treatments. In such instances, testing sometimes reveals difficult-to-evaluate complex structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in specific genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. METHODS: Targeted long-read sequencing (T-LRS) was performed on 33 individuals using Read Until on the Oxford Nanopore platform. This method allowed us to computationally target up to 100 Mbp of sequence per experiment, resulting in an average of 20x coverage of target regions, a 500% increase over background. We analyzed patient DNA for pathogenic substitutions, structural variants, and methylation differences using a single data source. RESULTS: The effectiveness of T-LRS was validated by detecting all genomic aberrations, including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences, previously identified by prior clinical testing. In 6/7 individuals who had complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, which led, in one case, to a change in clinical management. In nine individuals with suspected Mendelian conditions who lacked a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in five and variants of uncertain significance in two others. CONCLUSIONS: T-LRS can accurately predict pathogenic copy number variants and triplet repeat expansions, resolve complex rearrangements, and identify single-nucleotide variants not detected by other technologies, including short-read sequencing. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority candidate genes and regions or to further evaluate complex clinical testing results. The application of T-LRS will likely increase the diagnostic rate of rare disorders.

10
Metabolomic and transcriptomic signature in Kabuki syndrome

Jung, Y. L.; Hung, C.; Choi, J.; Lee, E. A.; Bodamer, O.

2025-04-30 genetic and genomic medicine 10.1101/2025.04.30.25326738 medRxiv
Top 0.1%
2.6%

Kabuki Syndrome (KS) is a rare multisystem disorder with a variable clinical phenotype. The majority of KS cases are caused by dominant loss-of-function mutations in two genes, KMT2D (lysine methyltransferase 2D, KS1) and KDM6A (lysine demethylase 6A, KS2). Both KMT2D and KDM6A play a critical role in chromatin accessibility, which is essential for developmental processes and differentiation. In a previous study, we reported that KMT2D mutations could lead to increased enhancer activity in genes related to metabolomic pathways in KS1. Early detection of KS is crucial in order to offer improved treatment options. To uncover new biomarkers that could facilitate early detection and to inform clinical trial readiness, we conducted a study in which we collected and analyzed plasma and urine metabolites from 40 KS patients with pathogenic mutations in either KMT2D or KDM6A and 12 healthy controls. We employed an untargeted approach using Liquid Chromatography with tandem Mass Spectrometry (LC-MS/MS). Additionally, we profiled gene expression in most KS patients and controls. Our analysis revealed > 100 significantly altered metabolites between KS patients and controls, with these metabolites clustering by genotype. Importantly, N2,N2-dimethylguanosine emerged as one of the top candidates in both KS1 and KS2 patients. We utilized machine learning classifiers and identified the most crucial metabolites. Using this trained model, we achieved a high level of discrimination between the KS data and controls. Furthermore, pathway analysis revealed several disrupted pathways, including the pyrimidine metabolism pathway, that were significantly altered in both the metabolome and the transcriptome in KS. Distinctive metabolites identified in KS can effectively serve as discriminative biomarkers. Our findings provide valuable insights into the metabolic dysregulation underlying KS and highlight potential targets for further investigation and therapeutic interventions.

11
The Refined Recurrence Risk of De Novo variants Due to Parental Mosaicism

Neuser, S.; Ahmad, N.; Popp, D.; Hentschel, J.; Lemke, J. R.; Krohn, K.; Schween, A.-L.; Karnstedt, M.; Drukewitz, S.; Wiedenhoeft, J.; Abou Jamra, R.

2025-07-02 genetic and genomic medicine 10.1101/2025.07.02.25330538 medRxiv
Top 0.1%
2.6%

Parents of children with genetic disorders due to de novo variants are counselled on a recurrence risk estimate of 1-5% for further affected siblings, while the actual probability varies between 0 and 50%. This discrepancy is well known, but barely investigated. We enrolled 135 families, in which a child had been previously identified with a pathogenic, seemingly de novo variant (in 140 genes). Covering two germ layers, we collected blood (n=269), buccal (n=223) and nail samples (n=223) of both parents and paternal semen (n=88). Using Nanopore long read sequencing, variants were phased to the parental allele of origin. We performed deep sequencing with unique molecular identifiers of all samples. We investigated low-level mosaicism by plotting the identified allele fractions and by Bayesian analyses. We confirmed the results individually using amplicon-based, spiked-in deep sequencing. Phasing revealed that 22% of variants occurred on the maternal and 78% on the paternal allele. Median raw target read depth was >7,000x, reduced to 523x after collapsing across all tissues (n≈130,000 measured values). We identified mosaicism in either the mother (n=2) or the father (n=4): four cases as mixed mosaicism and two cases as paternal gonadal mosaicism. The alternative allele fraction varied from 1.1% to 23%, while the results of the Bayesian model correlated well with amplicon-based sequencing. At 4.4%, we observe a slightly lower rate of parental mosaicism than reported in the literature. We now apply the amplicon-based sequencing of tissue samples - including semen - to routine counselling of parents with an affected child for individual risk assessment.

12
Long-Read Sequencing Increases Diagnostic Yield for Pediatric Sensorineural Hearing Loss

Redfield, S. E.; Shao, W.; Sun, T.; Pastolero, A.; Rowell, W. J.; French, C. E.; Nolan, C.; Holt, J. M.; Saunders, C. T.; Fanslow, C.; Lampraki, E. M.; Lambert, C.; Kenna, M.; Eberle, M.; Rockowitz, S.; Shearer, A. E.

2024-09-30 genetic and genomic medicine 10.1101/2024.09.30.24314377 medRxiv
Top 0.1%
2.4%

The diagnostic yield of genetic testing for pediatric sensorineural hearing loss (SNHL) has remained at around 40% for over a decade despite newly discovered causative genes and the expanded use of exome sequencing (ES). This stagnation may be due to (1) a focus on coding regions of the genome and (2) an inability to resolve variants in complex genomic regions due to reliance on short-read sequencing technologies. Short-read genome sequencing (srGS) and long-read genome sequencing (lrGS) both provide exonic single nucleotide variant (SNV) and small indel detection at the same sensitivity as ES, but also evaluate intronic regions. lrGS provides improved resolution for structural variants (SV) and repetitive genomic regions. We sought to investigate the potential utility of lrGS in the diagnostic evaluation of a small cohort of patients with SNHL of unknown etiology after ES and srGS. 19 pediatric patients with SNHL underwent lrGS via PacBio SMRT sequencing. Sequencing data were processed using the PacBio WGS variant pipeline. The diagnostic yield for this lrGS cohort was 4/19 (21%). Relevant variants detected only with lrGS included a hemizygous deletion in trans with a missense variant in an area of high genomic homology (OTOA) and two single nucleotide loss-of-function variants in trans to a known copy-number-loss for a gene with a highly homologous pseudogene (STRC). A complex inversion was identified in the MITF gene which was also identified on post-hoc analysis by srGS. LrGS provides improved resolution for complex genomic structural variation which may increase diagnostic yield for genetic pediatric SNHL, and, potentially, rare disease more broadly.

13
A Systematic Re-Analysis Of Copy Number Losses Of Uncertain Clinical Significance

Burghel, G.; Ellingford, J. M.; Wright, R.; Bradford, L.; Miller, J.; Watt, C.; Edgerley, J.; Naeem, F.; Banka, S.

2023-07-11 genetic and genomic medicine 10.1101/2023.07.02.23292123 medRxiv
Top 0.1%
2.4%

Background: Re-analysis of whole exome/genome data improves diagnostic yield. However, the value of re-analysis of clinical array comparative genomic hybridisation (aCGH) data has never been investigated. Case-by-case re-analysis is impractical in busy diagnostic laboratories. Methods and Results: We harmonised historical post-natal clinical aCGH results from ~16,000 patients tested via our diagnostic laboratory over ~7 years with current clinical guidance. This led to identification of 33,857 benign, 2,173 class 3, and 979 pathogenic copy number losses (CNLs). We found benign CNLs to be significantly less likely to encompass haploinsufficient genes compared to the pathogenic or class 3 CNLs in our database. Using this observation, we developed a re-analysis pipeline (using up-to-date disease association data and haploinsufficiency scores) and shortlisted 207 class 3 CNLs encompassing at least one autosomal dominant disease-gene associated with haploinsufficiency or loss-of-function mechanism. Clinical scientist review led to reclassification of 7.2% shortlisted class 3 CNLs as pathogenic or likely pathogenic. This included first cases of CNV-mediated disease for some genes where all previously described cases involved only point variants. Interestingly, some CNLs could not be re-classified because the phenotypes of patients with CNLs seemed distinct from the known clinical features resulting from point variants, thus raising questions about accepted underlying disease mechanisms. Several potential novel disease-genes were identified that would need further validation. Conclusions: Re-analysis of clinical aCGH data increases diagnostic yield and demonstrates their research value. In future, the aCGH reanalysis program should be expanded to include other copy number variant types.

14
High-Resolution Variant Profiling of CAH-Associated Genes Using a Long-Read Sequencing Assay

Gu, S.; Chu, G.; Cai, R.; Sun, X.; Lu, Q.; Han, W.; Deng, S.; Wang, X.; Xiang, J.; He, R.

2025-10-27 genetic and genomic medicine 10.1101/2025.10.24.25338696 medRxiv
Top 0.1%
2.1%

Background: Congenital adrenal hyperplasia (CAH) is a group of autosomal recessive disorders primarily caused by mutations in CYP21A2 and CYP11B1. However, accurate genetic diagnosis remains challenging due to the high sequence homology between these genes and their corresponding homologs (CYP21A1P, CYP11B2). Methods: We developed NanoCAH (Nanopore-based Comprehensive Analysis of CAH), a targeted long-read sequencing assay based on the Cyclone nanopore platform, CycloneSEQ G100-ER. This approach uses gene-specific long-range PCR to amplify CYP21A2 and CYP11B1, enabling detection of all classes of variants including single nucleotide variants (SNVs), copy number variants (CNVs), and gene conversions. Results: A total of 59 samples were analyzed by NanoCAH, including 26 singletons and 11 trios. Among these samples, NanoCAH successfully detected all 85 variants previously identified by conventional MLPA and Sanger sequencing (100%, 85/85), including SNVs/Indels, deletions and duplications. Importantly, seven Exon 8 deletions and one Exon 8 duplication undetectable by MLPA were identified by the NanoCAH assay. In addition, NanoCAH resolved a novel heterozygous 111-bp in-exon tandem duplication in CYP21A2 (c.65_175dup). NanoCAH further accurately resolved haplotypes in twenty cases without parental data, identifying 18 variants in trans and 2 in cis. These results demonstrate NanoCAH's robust capacity to detect complex variant types and provide precise haplotype resolution in a single assay. Conclusions: NanoCAH provides an accurate, cost-effective, and scalable solution for comprehensive CAH genotyping. Its ability to detect complex variant types and resolve haplotypes in a single assay highlights its potential as a first-line diagnostic tool in clinical genetics.

15
Medical Evaluation of Unanticipated Monogenic Disease Risks Identified through Newborn Genomic Screening: Findings from the BabySeq Project

Green, R. C.; Shah, N.; Genetti, C.; Yu, T. W.; Zettler, B.; Schwartz, T.; Uveges, M.; Ceyhan-Birsoy, O.; Lebo, M.; Pereira, S.; Agrawal, P.; Parad, R.; McGuire, A.; Christensen, K.; Rehm, H. L.; Holm, I.; Beggs, A.

2022-03-18 genetic and genomic medicine 10.1101/2022.03.18.22272284 medRxiv
Top 0.1%
2.1%

Genomic sequencing of healthy newborns to screen for medically important genetic information has long been anticipated but data around downstream medical consequences are lacking. Among 159 infants randomized to the sequencing arm in the BabySeq Project, an unanticipated monogenic disease risk (uMDR) was discovered in 18 (11.3%). We assessed uMDR actionability by visualizing scores from a modified ClinGen Actionability SemiQuantitative Metric and tracked medical outcomes in these infants for 3-5 years. All uMDRs scored as highly actionable (mean 9, range: 7-11 on a 0-12 scale) and had readily available clinical interventions. In 4 cases, uMDRs revealed unsuspected genetic etiologies for existing phenotypes, and in the remaining 14 cases provided risk stratification for future surveillance. In 8 cases, uMDRs prompted screening for multiple at-risk family members. These results suggest that actionable uMDRs are more common than previously thought and support ongoing efforts to evaluate population-based newborn genomic screening.

16
Systematic assessment of rare and de novo structural variants in 57 patient-parent trios using optical genome mapping

van der Sanden, B.; Vorimo, S.; Brunet, T.; Boughalem, A.; Jacob, M.; van Beek, R.; Kamping, E.; Rahikkala, E.; Kuismin, O.; Moilanen, J.; Pylkas, K.; Graf, E.; Loesecke, S.; Brugger, M.; Derderian, K.; Schatz, U.; Wagner, M.; Zech, M.; Schwaibold, E. M. C.; Distelmaier, F.; Borggraefe, I.; Vill, K.; Vissers, L. E. L. M.; Winkelmann, J.; Neveling, K.; Meitinger, T.; Mantere, T.; Trost, D.; Hoischen, A.

2026-01-21 genetic and genomic medicine 10.64898/2026.01.16.26344264 medRxiv
Top 0.1%
2.1%

Next-generation sequencing has unraveled the genetic cause for many individuals with a rare disease, but a significant number of individuals remain undiagnosed using standard of care tests. It is anticipated that structural variants (SVs) have not been fully assessed in this context. Here, we performed optical genome mapping (OGM) for 57 trios and prioritized SVs using a two-step approach. First, we systematically identified all de novo SVs, and subsequently we studied all rare inherited SVs. Potential pathogenic SVs were confirmed using orthogonal methods. On average, we identified 6,289 SVs >500bp per proband, primarily insertions (69.8%) and deletions (27.1%). In total, we identified 13 de novo SVs, confirming a de novo mutation rate for large SVs of 0.23 or 1 in 4-5 cases. These de novo SVs impacted multiple (candidate) disease-associated genes, including NSF and FGF9. Additionally, on average per sample, we identified 11 rare inherited SVs overlapping with an established OMIM disease gene or its regulatory region, including a homozygous deletion affecting SCN9A causing congenital indifference to pain, a maternally inherited deletion in WWOX causing developmental and epileptic encephalopathy, and an interchromosomal insertion in the CMTX3 locus at Xq27.1 causing X-linked Charcot-Marie-Tooth disease. In total, we identified pathogenic SVs in three individuals and candidate disease-causing SVs in five other individuals. Overall, OGM enabled the accurate detection of challenging de novo and rare inherited SVs. Our results suggest a potential yield of disease-associated SVs in 5-14% of index cases, demonstrating that OGM can unravel previously hidden SVs in extensively tested individuals.

17
RAPID: A Targeted Long-Read RNA Workflow for Functional Resolution of Splicing Variants in Rare Disease

Montgomery, K.-a.; Macpherson, H.; Anderson, C.; Wade, C.; Gustavsson, E. K.; Lynch, D. S.; Wilson, L. C.; Davison, J.; Wakeling, E.; Tuschl, K.; Houlden, H.; Clement, E.; Mills, P.; Ryten, M.

2026-01-02 genetic and genomic medicine 10.64898/2025.12.30.25342835 medRxiv
Top 0.1%
2.0%

Background: Molecular diagnosis of rare disease plateaus at ~50%, partly due to technical limitations of short-read sequencing and the persistent challenge of interpreting variants of uncertain significance (VUS). Splice-altering variation represents a major source of unresolved cases, yet functional assessment remains difficult in routine practice. Methods: We developed a fully modular, sample-to-answer workflow for targeted long-read RNA sequencing (lrRNA-seq) using Oxford Nanopore Technologies and applied it to six unsolved cases with suspected monogenic neurometabolic disease. Candidates were selected after WES/WGS and multidisciplinary team (MDT) review indicating ≤5 genes of interest. The workflow was designed to be diagnostically deployable, enabling near-full-length transcript assessment from accessible tissues without reliance on large control cohorts. Results: lrRNA-seq yielded actionable findings for all six probands. It confirmed pathogenic splice disruption in two cases, prompted gene exclusion in one case, and generated RNA-level evidence prioritising further DNA investigation in three cases. Across these scenarios, lrRNA-seq provided direct, mechanism-level insight that either resolved diagnosis or refined variant interpretation. The workflow provided near-full-length isoform structures with reproducible single-sample interpretation and produced informative results within two working days at <£500 per sample reagent cost. Conclusion: Targeted lrRNA-seq offers rapid, cost-effective functional evidence to resolve VUS, direct DNA follow-up, and support timely diagnosis in rare disease. The RAPID workflow demonstrates that long-read RNA sequencing can be implemented within existing diagnostic infrastructure and provides a scalable route to routine transcript-level assessment in clinical genomics.

18
Non-coding variants are a rare cause of recessive developmental disorders in trans with coding variants

Lord, J.; Jaramillo Oquendo, C.; Martin-Geary, A. C.; Blakes, A. J.; Arciero, E.; Domcke, S.; Childs, A.-M.; Low, K.; Rankin, J.; Genomics England Research Consortium, ; Baralle, D.; Martin, H. C.; Whiffin, N.

2023-06-29 genetic and genomic medicine 10.1101/2023.06.23.23291805 medRxiv
Top 0.1%
2.0%

Purpose: Identifying pathogenic non-coding variants in individuals with developmental disorders (DD) is challenging due to the large search space. It is common to find a single protein-altering variant in a recessive gene in DD patients, but the prevalence of pathogenic non-coding "second hits" in trans with these is unknown. Methods: In 4,073 genetically undiagnosed rare disease trio probands from the 100,000 Genomes Project, we identified rare heterozygous loss-of-function (LoF) or ClinVar pathogenic variants in recessive DD-associated genes. Using stringent region-specific filtering, we identified rare non-coding variants on the other haplotype. Identified genes were clinically evaluated for phenotypic fit, and where possible, we performed functional testing using RNA-sequencing. Results: We found 2,430 probands with one or more rare heterozygous pLoF or ClinVar pathogenic variants in recessive DD-associated genes, for a total of 3,761 proband-variant pairs. For 1,366 (36.3%) of these pairs, we identified at least one rare non-coding variant in trans. After stringent bioinformatic filtering and clinical review, five were determined to be a good clinical fit (in ALMS1, NPHP3, LAMA2, IGHMBP2 and GAA). Conclusion: We developed a pipeline to systematically identify and annotate compound heterozygous coding/non-coding genotypes. Using this approach we uncovered new diagnoses and conclude that this mechanism is a rare cause of DDs.

19
Multi-site Technical Performance and Concordance of Optical Genome Mapping: Constitutional Postnatal Study for SV, CNV, and Repeat Array Analysis

Iqbal, M. A.; Broeckel, U.; Levy, B.; Skinner, S.; Sahajpal, N. S.; Rodriguez, V.; Stence, A.; Awayda, K.; Scharer, G.; Skinner, C.; Stevenson, R.; Bossler, A.; Nagy, P. L.; Kolhe, R.

2021-12-30 genetic and genomic medicine 10.1101/2021.12.27.21268432 medRxiv
Top 0.1%
1.9%

Background: The standard of care (SOC) cytogenetic testing methods, such as chromosomal microarray (CMA) and Fragile-X syndrome (FXS) testing, have been employed for the detection of copy number variations (CNVs) and tandem repeat expansions/contractions that contribute towards a sizable portion of genetic abnormalities in constitutional disorders. However, CMA is unable to detect balanced structural variations (SVs) or determine the precise location or orientation of copy number gains. Karyotyping, albeit with lower resolution, has been used for the detection of balanced SVs. Other molecular methods such as PCR and Southern blotting, either simultaneously or in a tiered fashion, have been used for FXS testing, adding time, cost, and complexity to reach an accurate diagnosis in affected individuals. Optical genome mapping (OGM), an innovative technology in the cytogenomics arena, enables a direct, high-resolution view of ultra-long DNA molecules (more than 150 kbp), which are then assembled de novo to detect germline SVs ranging from 500 bp insertions and deletions to complex chromosomal rearrangements. The purpose of this study was to evaluate the performance of OGM in comparison to the current SOC methods and assess the intra- and inter-site reproducibility of the OGM technique. We report the largest retrospective dataset to date on OGM performed at five laboratories (multi-site) to assess the robustness, QC performance, and analytical validation (multi-operator and multi-instrument) in detecting SVs and CNVs associated with constitutional disorders compared to SOC technologies. Methods: This multi-center, IRB-approved, double-blinded study includes a total of 331 independent flow cell runs (including replicates), representing 202 unique retrospective samples, including but not limited to pediatric-onset neurodevelopmental disorders. This study included affected individuals with either a known genetic abnormality or no known genetic diagnosis.
Control samples (n=42) were also included. Briefly, OGM was performed on either peripheral blood samples or cell lines using the Saphyr system. The OGM assay results were compared to the human reference genome (GRCh38) to detect different types of SVs (CNV, insertions, inversions, translocations). A unique coverage-based CNV calling algorithm was also used to complement the SV calls. Analysis of heterozygous SVs was performed to assess the absence of heterozygosity (AOH) regions in the genome. For specific clinical indications of FSHD1 and FXS, the EnFocus FXS and FSHD1 tools were used to generate the region-specific reports. OGM data were analyzed and visualized using Access software (version 1.7), where the SVs were filtered using an OGM-specific internal control database. The samples were analyzed by laboratory analysts at each site in a blinded fashion using ACMG guidelines for SV interpretation and further reviewed by expert geneticists to assess concordance with SOC testing results. Results: Of the first 331 samples run across five sites, 99.1% of sample runs were completed successfully. Of the 331 datasets, 219 were assessed for concordance by the time of this publication; these were samples that harbored known variants, of which 214/219 were detected by OGM, resulting in a concordance of 97.7% compared to SOC testing. 47 samples were also run in intra- and inter-site replicate and showed 100% concordance for pathogenic CNVs and SVs and 100% concordance for pathogenic FMR1 repeat expansions. Conclusion: The results from this study demonstrate the potential of OGM as an alternative to existing SOC methods in detecting SVs of clinical significance in constitutional postnatal genetic disorders. The outstanding technical performance of OGM across multiple sites demonstrates the robustness and reproducibility of the OGM technique as a rapid cytogenomics testing tool.
Notably, OGM detected all classes of SVs in a single assay, which allows for a faster result in cases with diverse and heterogeneous clinical presentations. OGM demonstrated 100% concordance for pathogenic variants previously identified including FMR1 repeat expansions (full mutation range), pathogenic D4Z4 repeat contractions (FSHD1 cases), aneuploidies, interstitial deletions, interstitial duplications, intragenic deletions, balanced translocations, and inversions. Based on our large dataset and high technical performance we recommend OGM as an alternative to the existing SOC tests for the rapid detection and diagnosis of postnatal constitutional disorders.

20
Topologically Associating Domain Boundaries are Commonly Required for Normal Genome Function

Rajderkar, S.; Barozzi, I.; Zhu, Y.; Hu, R.; Zhang, Y.; Li, B.; Fukuda-Yuzawa, Y.; Kelman, G.; Akeza, A.; Blow, M. J.; Pham, Q.; Harrington, A. N.; Godoy, J.; Meky, E. M.; von Maydell, K.; Novak, C. S.; Plajzer-Frick, I.; Afzal, V.; Tran, S.; Talkowski, M. E.; Lloyd, K. C. K.; Ren, B.; Dickel, D. E.; Visel, A.; Pennacchio, L. A.

2021-05-07 genomics 10.1101/2021.05.06.443037 medRxiv
Top 0.1%
1.9%

Topologically associating domain (TAD) boundaries are thought to partition the genome into distinct regulatory territories. Anecdotal evidence suggests that their disruption may interfere with normal gene expression and cause disease phenotypes [1-3], but the overall extent to which this occurs remains unknown. Here we show that TAD boundary deletions commonly disrupt normal genome function in vivo. We used CRISPR genome editing in mice to individually delete eight TAD boundaries (11-80 kb in size) from the genome. All deletions examined resulted in at least one detectable molecular or organismal phenotype, which included altered chromatin interactions or gene expression, reduced viability, and anatomical phenotypes. For 5 of 8 (62%) loci examined, boundary deletions were associated with increased embryonic lethality or other developmental phenotypes. For example, a TAD boundary deletion near Smad3/Smad6 caused complete embryonic lethality, while a deletion near Tbx5/Lhx5 resulted in a severe lung malformation. Our findings demonstrate the importance of TAD boundary sequences for in vivo genome function and suggest that noncoding deletions affecting TAD boundaries should be carefully considered for potential pathogenicity in clinical genetics screening.